(WIP) Batched autodiff #2181
base: main
Conversation
```diff
@@ -27,7 +27,11 @@ getFunctionTypeForClone(mlir::FunctionType FTy, DerivativeMode mode,
   for (auto &&[Ty, returnPrimal, returnShadow, activity] : llvm::zip(
            FTy.getResults(), returnPrimals, returnShadows, ReturnActivity)) {
     if (returnPrimal) {
-      RetTypes.push_back(Ty);
+      if (width != 1) {
+        RetTypes.push_back(mlir::RankedTensorType::get({width}, Ty));
```
This shouldn't need changing, since the primal is always unmodified; only the derivatives are changed (and we should be pushing the getShadowType-derived types for those below).
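A minimal sketch of the suggested shape of that code, assuming `returnShadow` gates the shadow results and that `AutoDiffTypeInterface::getShadowType` accepts the batch width (consistent with the snippets later in this thread, but not the PR's actual code):

```cpp
if (returnPrimal) {
  // The primal result type is always unmodified, regardless of width.
  RetTypes.push_back(Ty);
}
if (returnShadow) {
  // Only the shadow is batched: for width != 1 this would turn e.g. f64
  // into tensor<2xf64>, assuming getShadowType performs the widening.
  RetTypes.push_back(cast<AutoDiffTypeInterface>(Ty).getShadowType(width));
}
```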
Oh, then I'm confused about what batched autodiff is.
How should my test case change?
Nvm, it clicked. It's just the shadow that's batched 😅
So here's an example from LLVM vector mode: https://github.com/EnzymeAD/Enzyme/blob/main/enzyme/test/Enzyme/ForwardModeVector/add.ll
Though perhaps mul will be more illustrative: https://github.com/EnzymeAD/Enzyme/blob/main/enzyme/test/Enzyme/ForwardModeVector/mul.ll (and obviously feel free to look at any/all of the other examples).
I haven't yet fully made the changes in enzyme-tblgen.cpp, and either way this only works for the simple test case so far:

```cpp
mlir::Value itmp = ({
  // Computing MulFOp
  auto fwdarg_0 = dif;
  auto fwdarg_1 = gutils->getNewFromOriginal(op->getOperand(1));
  if (gutils->width != 1) {
    // Broadcast the scalar operand to the batch width so the elementwise
    // mulf below type-checks against the batched shadow.
    fwdarg_1 = builder.create<tensor::SplatOp>(
        op.getLoc(),
        mlir::RankedTensorType::get({gutils->width}, fwdarg_1.getType()),
        fwdarg_1);
  }
  builder.create<arith::MulFOp>(op.getLoc(), fwdarg_0, fwdarg_1);
});
```

But this is the MLIR code that is generated for this simple test:

```mlir
func.func private @fwddiffe2square(%arg0: f64, %arg1: tensor<2xf64>) -> tensor<2xf64> {
  %splat = tensor.splat %arg0 : tensor<2xf64>
  %0 = arith.mulf %arg1, %splat : tensor<2xf64>
  %splat_0 = tensor.splat %arg0 : tensor<2xf64>
  %1 = arith.mulf %arg1, %splat_0 : tensor<2xf64>
  %2 = arith.addf %0, %1 : tensor<2xf64>
  %3 = arith.mulf %arg0, %arg0 : f64
  return %2 : tensor<2xf64>
}
```
This still requires changes in the tblgen-generated derivative files. For example, `createForwardModeTangent` in `MulFOpFwdDerivative` could be altered like this:

```cpp
LogicalResult createForwardModeTangent(Operation *op0, OpBuilder &builder,
                                       MGradientUtils *gutils) const {
  auto op = cast<arith::MulFOp>(op0);
  if (gutils->width != 1) {
    auto newop = gutils->getNewFromOriginal(op0);
    for (auto res : newop->getResults()) {
      res.setType(mlir::RankedTensorType::get({gutils->width}, res.getType()));
    }
  }
  gutils->eraseIfUnused(op);
  if (gutils->isConstantInstruction(op))
    return success();
  mlir::Value res = nullptr;
  if (!gutils->isConstantValue(op->getOperand(0))) {
    auto dif = gutils->invertPointerM(op->getOperand(0), builder);
    {
      mlir::Value itmp = ({
        // Computing MulFOp
        auto fwdarg_0 = dif;
        dif.dump();
        // TODO: gutils->makeBatched(...)
        auto fwdarg_1 = gutils->getNewFromOriginal(op->getOperand(1));
        builder.create<arith::MulFOp>(op.getLoc(), fwdarg_0, fwdarg_1);
      });
      itmp.dump();
      if (!res)
        res = itmp;
      else {
        auto operandType = cast<AutoDiffTypeInterface>(res.getType());
        res = operandType.createAddOp(builder, op.getLoc(), res, itmp);
      }
    }
  }
  if (!gutils->isConstantValue(op->getOperand(1))) {
    auto dif = gutils->invertPointerM(op->getOperand(1), builder);
    {
      mlir::Value itmp = ({
        // Computing MulFOp
        auto fwdarg_0 = dif;
        dif.dump();
        auto fwdarg_1 = gutils->getNewFromOriginal(op->getOperand(0));
        builder.create<arith::MulFOp>(op.getLoc(), fwdarg_0, fwdarg_1);
      });
      if (!res)
        res = itmp;
      else {
        auto operandType = cast<AutoDiffTypeInterface>(res.getType());
        res = operandType.createAddOp(builder, op.getLoc(), res, itmp);
      }
    }
  }
  assert(res);
  gutils->setDiffe(op->getResult(0), res, builder);
  return success();
}
```
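The `// TODO: gutils->makeBatched(...)` above suggests factoring the broadcast out of each tblgen-generated rule. A minimal sketch of what such a helper might look like, reusing the `tensor::SplatOp` pattern from the earlier snippet (the name `makeBatched` and its signature are hypothetical, not part of the PR):

```cpp
// Hypothetical helper: broadcast a scalar value to the batch width so it can
// be combined with an already-batched shadow value. For width == 1 the value
// is returned unchanged.
static mlir::Value makeBatched(MGradientUtils *gutils, OpBuilder &builder,
                               mlir::Location loc, mlir::Value v) {
  if (gutils->width == 1)
    return v;
  return builder.create<tensor::SplatOp>(
      loc, mlir::RankedTensorType::get({gutils->width}, v.getType()), v);
}
```

Each derivative rule could then call `fwdarg_1 = makeBatched(gutils, builder, op.getLoc(), fwdarg_1);` instead of open-coding the splat.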
This reverts commit c06ed01.
```tablegen
    NOTE: Only works for scalars and *ranked* tensors for now.
  }];

  let arguments = (ins AnyType:$input, I64Attr:$width);
```
To support Enzyme.batch [which is a bit more general, since it takes a shape rather than just a single int], do we want to make this a vararg of i64?
enzyme/Enzyme/MLIR/Dialect/Ops.cpp
```cpp
void BroadcastOp::build(OpBuilder &builder, OperationState &result,
                        Value input, llvm::SmallVector<int64_t> shape) {
  auto shapeAttr = builder.getDenseI64ArrayAttr(shape);
  RankedTensorType output;
  // TODO: support things other than scalars and ranked tensors,
  // maybe reuse getShadowType here?
```
Yeah, long term we should probably just use `getShadowType` (declared as `/*methodName=*/"getShadowType",` in the interface), so essentially do:

```cpp
auto ty = input.getType();
for (auto s : reverse(shape)) {
  ty = ty.cast<AutoDiffTypeInterface>().getShadowType(s);
}
```
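A sketch of how that loop could complete the `build` method from the diff above. It assumes `AutoDiffTypeInterface` is implemented for the input type and that an ODS-generated `build` overload taking an explicit result type exists; both are assumptions, not verified against the PR:

```cpp
void BroadcastOp::build(OpBuilder &builder, OperationState &result,
                        Value input, llvm::ArrayRef<int64_t> shape) {
  auto shapeAttr = builder.getDenseI64ArrayAttr(shape);
  // Fold getShadowType over the shape, innermost dimension first, so that a
  // scalar f64 with shape [2, 3] ends up as a batched 2x3 shadow type.
  mlir::Type ty = input.getType();
  for (auto s : llvm::reverse(shape))
    ty = mlir::cast<AutoDiffTypeInterface>(ty).getShadowType(s);
  build(builder, result, ty, input, shapeAttr);
}
```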
but for now this is fine
fix the format/etc then I think this is good to go!
Added some type conversions to tensor types if `width != 1`. The simple test case seems correct now.

Corresponding Enzyme-JAX PR: EnzymeAD/Enzyme-JAX#197
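For reference, the kind of conversion described here, written out as a standalone sketch (the helper name `batchedType` is illustrative; the PR inlines this logic at each use site):

```cpp
// Widen a scalar type to its batched counterpart: f64 becomes tensor<2xf64>
// for width == 2, while width == 1 leaves the type unchanged. This mirrors
// the RankedTensorType::get({width}, Ty) calls in the diff above.
static mlir::Type batchedType(mlir::Type Ty, int64_t width) {
  if (width == 1)
    return Ty;
  return mlir::RankedTensorType::get({width}, Ty);
}
```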